How to train a neural network to classify images using the CIFAR-10 dataset
TABLE OF CONTENTS:
- Abstract
- Dataset and Libraries
- Defining device for use in model
- Data Preprocessing and Visualisation
- Model Definition
- Defining optimisers
- Defining the training loop and using the CNN
- Evaluation and Results
- Demonstrating how different hyperparameters can be altered to change the validation accuracy
- Discussion of hyperparameter results
- How this tutorial differs from other similar tutorials
- References
ABSTRACT: This tutorial demonstrates how to use a Jupyter Notebook to develop a machine learning pipeline for image classification using PyTorch. It uses the CIFAR-10 dataset, which consists of images from 10 different object categories. The topics covered include data preprocessing, model training, evaluation, and hyperparameter tuning. By the end of this guide, users will have a fully functional AI system capable of classifying images, and an understanding of how different hyperparameters affect model performance.
To begin, we must ensure that our machine learning pipeline can access all the libraries it requires. For this tutorial, we will use PyTorch, Matplotlib and NumPy, imported as shown below:
# Importing the required libraries
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
# PyTorch Library: PyTorch Team. (2025). torch module documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/index.html
# Torchvision: PyTorch Team. (2025). torchvision module documentation. Retrieved February 25, 2025, from https://pytorch.org/vision/stable/index.html
# Matplotlib: Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), 90-95. Retrieved February 25, 2025, from https://matplotlib.org/stable/index.html
# NumPy: Harris, C. R., et al. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. Retrieved February 25, 2025, from https://numpy.org/doc/stable/
After importing the required libraries, we give our model the ability to train on an accelerator. An accelerator is a device, such as a GPU, that can be used alongside the CPU to speed up the computation of our machine learning model.
# Step 1: Define Device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f'Using device: {device}')
# [Source: https://pytorch.org/docs/stable/notes/cuda.html]
# https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html#model-layers
Using device: cpu
The next stage of the machine learning pipeline is loading and normalising the data used to train the model. This data will be ingested by the pipeline and used to teach it how to separate samples into appropriate groups. In this example, we will be using image data from the CIFAR-10 dataset. The pipeline will use these images and their classifications to learn which attributes are present in each group, gaining the ability to identify which group an image belongs to based on those attributes. We will also transform the data so that it is suitable for use within our network.
# Step 2: Load and Normalise Data
# Download and dataloader data from the CIFAR10 as shown in https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
# We use transforms to change attributes of the data to make it appropriate for the pipeline.
data_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),  # Randomly flips images horizontally to introduce variation
    transforms.RandomRotation(10),  # Rotates images by a small angle to enhance model robustness
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),  # Adjusts brightness/contrast/saturation/hue
    transforms.ToTensor(),  # Converts images to tensors for PyTorch compatibility
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Normalises pixel values to improve learning stability
])
# Loading CIFAR-10 dataset [Source: https://pytorch.org/vision/stable/datasets.html#torchvision.datasets.CIFAR10]
train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, transform=data_transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, transform=data_transform, download=True)
# DataLoader allows efficient batch loading [Source: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader]
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=128, shuffle=False)
# https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
# PyTorch Team. (2025). torch.cuda documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/cuda.html
Files already downloaded and verified
Files already downloaded and verified
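The Normalize transform above maps each channel from [0, 1] to [-1, 1], since it computes output = (input - mean) / std with mean = std = 0.5. A minimal sketch of that arithmetic in plain NumPy (independent of the pipeline, for illustration only):

```python
import numpy as np

# Pixel values after ToTensor() lie in [0, 1].
pixels = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

mean, std = 0.5, 0.5
normalised = (pixels - mean) / std      # maps [0, 1] -> [-1, 1]
unnormalised = normalised * std + mean  # inverse, used later when displaying images

print(normalised.tolist())  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

The inverse step (multiply by std, add the mean) is exactly what the visualisation code later in this tutorial performs with `img = (img * 0.5) + 0.5`.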
Next, we will define the Convolutional Neural Network to be used in our machine learning pipeline. As we are training our pipeline to identify image data, a CNN is most appropriate. This is because CNNs automatically capture spatial hierarchies in images, such as edges and textures, and can identify further patterns that occur within the dataset. Within this neural network, we will use multiple convolutional layers, batch normalisation, ReLU activations and pooling.
# Step 3: Define CNN Model with Batch Normalisation and Dropout
class CNN(nn.Module):  # [Source: PyTorch Official Examples]
    def __init__(self):
        super(CNN, self).__init__()
        # First convolutional layer extracts low-level features [Source: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html]
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)  # Increased filters
        self.bn1 = nn.BatchNorm2d(64)
        # Second convolutional layer extracts mid-level features.
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(128)
        # Third convolutional layer extracts deeper patterns.
        self.conv3 = nn.Conv2d(128, 256, kernel_size=3, padding=1)  # Added extra layer
        self.bn3 = nn.BatchNorm2d(256)
        # Max pooling reduces spatial dimensions while retaining important features. [Source: https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html]
        self.pool = nn.MaxPool2d(2, 2)
        # Dropout prevents overfitting [Source: https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html]
        self.dropout = nn.Dropout(0.4)  # Adjusted dropout
        # Fully connected layers for classification.
        self.fc1 = nn.Linear(256 * 4 * 4, 256)  # Adjusted fully connected layer
        self.fc2 = nn.Linear(256, 10)

    # Define the forward pass of the neural network
    def forward(self, x):
        x = self.pool(torch.relu(self.bn1(self.conv1(x))))  # Apply first convolution and ReLU activation
        x = self.pool(torch.relu(self.bn2(self.conv2(x))))  # Apply second convolution and ReLU activation
        x = self.pool(torch.relu(self.bn3(self.conv3(x))))  # Apply third convolution and ReLU activation
        x = torch.flatten(x, 1)  # Flattens feature maps into a vector for FC layers [Source: https://pytorch.org/docs/stable/generated/torch.flatten.html]
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)  # Applies dropout for regularisation [Source: https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html]
        x = self.fc2(x)  # Final classification layer [Source: https://pytorch.org/docs/stable/generated/torch.nn.Linear.html]
        return x

# Create an instance of the model and move it to the selected device.
model = CNN().to(device)
# https://medium.com/@myringoleMLGOD/simple-convolutional-neural-network-cnn-for-dummies-in-pytorch-a-step-by-step-guide-6f4109f6df80
# PyTorch Team. (2025). torchvision.transforms documentation. Retrieved February 25, 2025, from https://pytorch.org/vision/stable/transforms.html
# Normalisation values for CIFAR-10: Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. University of Toronto.
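The input size of fc1, 256 * 4 * 4, follows from the pooling: CIFAR-10 images are 32×32, and each of the three MaxPool2d(2, 2) layers halves the spatial dimensions. A quick sanity check of that arithmetic:

```python
# CIFAR-10 images are 32x32; each MaxPool2d(2, 2) halves height and width.
size = 32
for _ in range(3):  # three conv + pool stages
    size //= 2      # 32 -> 16 -> 8 -> 4

channels = 256      # output channels of conv3
flattened = channels * size * size

print(size, flattened)  # 4 4096, matching nn.Linear(256 * 4 * 4, 256)
```

Checking this by hand is useful whenever a convolutional layer or pooling stage is added or removed, since a mismatched flatten size is a common source of shape errors.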
The next stage of the machine learning pipeline is to use the torch.optim package to implement different optimisation algorithms. Within this tutorial, I will be using cross-entropy loss, stochastic gradient descent with momentum, and a cosine annealing learning rate schedule. Each of these can help increase the accuracy of the model produced. The loss function, CrossEntropyLoss, is commonly used for multi-class classification problems; we use it within our pipeline to quantify how far the predicted values are from the true labels.
# Step 4: Define Loss, Optimiser, and Scheduler
criterion = nn.CrossEntropyLoss() # Cross-entropy loss for multi-class classification
optimiser = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4) # Stochastic Gradient Descent with momentum
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimiser, T_max=15) # Cosine Annealing Learning Rate
# [Source: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html]
# [Source: https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html]
# [Source: https://pytorch.org/docs/stable/generated/torch.optim.SGD.html]
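CrossEntropyLoss combines a softmax over the raw logits with a negative log-likelihood on the true class. A hand-computed sketch for a single example (pure Python with made-up logits, not the PyTorch implementation):

```python
import math

logits = [2.0, 0.5, 0.1]  # hypothetical raw scores for 3 classes
true_class = 0

# Softmax turns logits into a probability distribution.
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]

# Cross-entropy loss is the negative log-probability of the true class.
loss = -math.log(probs[true_class])
print(round(loss, 4))  # approximately 0.3168
```

The loss is small when the model assigns high probability to the correct class and grows without bound as that probability approaches zero, which is what drives the gradient updates in the training loop below.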
The next stage of the machine learning pipeline is to feed the data in the dataset into our model. One way to alter the accuracy of the model is to change the number of epochs it is trained for; each epoch is one full pass over the training dataset. Increasing the number of epochs generally improves the model's ability to fit the data. However, it is important not to make the number of epochs too large, as this can lead to overfitting: the occurrence of a model predicting the training data accurately while dealing inaccurately with the validation and testing data.

Within the training loop, we calculate and display the accuracy of the model both as text and graphically using the Matplotlib library. The loop uses backpropagation and an optimiser to update the model, improving its performance and predictive ability. We calculate the loss of the model in two modes, training and evaluation, to test it under separate conditions: how well it deals with seen and unseen data respectively. Comparing the two is how we measure whether the model is overfitting.
# Step 5: Training Loop
def train_model(model, train_loader, test_loader, criterion, optimiser, scheduler, num_epochs=15):
    model.train()
    train_losses, val_losses = [], []  # Lists used to store losses
    train_accuracies, val_accuracies = [], []  # Lists used to store calculated accuracies
    epochs = []
    for epoch in range(num_epochs):
        running_loss, correct_train, total_train = 0.0, 0, 0  # Running loss and counters for training accuracy
        for images, labels in train_loader:  # For each batch of images and labels in the training data
            images, labels = images.to(device), labels.to(device)  # Send the input to the device
            optimiser.zero_grad()  # Reset gradients from the previous step
            outputs = model(images)
            loss = criterion(outputs, labels)  # Calculates how far the predicted values are from the true labels
            loss.backward()  # Backpropagation to compute gradients
            optimiser.step()
            running_loss += loss.item()
            _, predicted = torch.max(outputs, 1)
            total_train += labels.size(0)
            correct_train += (predicted == labels).sum().item()
        train_loss = running_loss / len(train_loader)
        train_acc = 100 * correct_train / total_train
        train_losses.append(train_loss)
        train_accuracies.append(train_acc)
        epochs.append(epoch + 1)
        model.eval()  # Sets the model to evaluation mode for validation
        val_loss, correct_val, total_val = 0.0, 0, 0
        with torch.no_grad():  # Disable gradient computation for validation
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)  # Compute validation loss
                val_loss += loss.item()
                _, predicted = torch.max(outputs, 1)
                total_val += labels.size(0)
                correct_val += (predicted == labels).sum().item()
        val_loss /= len(test_loader)
        val_acc = 100 * correct_val / total_val
        val_losses.append(val_loss)
        val_accuracies.append(val_acc)
        scheduler.step()  # Adjust learning rate
        print(f'Epoch {epoch+1}/{num_epochs} | Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}% | Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%')
        model.train()  # Switch back to training mode
    # Plot training results
    plt.figure(figsize=(12, 6))
    plt.subplot(1, 2, 1)
    plt.plot(epochs, train_losses, label='Train Loss')
    plt.plot(epochs, val_losses, label='Validation Loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.title('Loss Over Epochs')
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(epochs, train_accuracies, label='Train Accuracy')
    plt.plot(epochs, val_accuracies, label='Validation Accuracy')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy (%)')
    plt.title('Accuracy Over Epochs')
    plt.legend()
    plt.show()
    return train_losses, train_accuracies, val_losses, val_accuracies
train_model(model, train_loader, test_loader, criterion, optimiser, scheduler, num_epochs=15) # Run the model
# https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html
# https://pytorch.org/docs/stable/tensors.html#torch.Tensor.to
# https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.forward
# https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
# https://pytorch.org/docs/stable/autograd.html
# https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
# https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.step.html
# https://pytorch.org/docs/stable/generated/torch.max.html
Epoch 1/15 | Train Loss: 1.5120, Train Acc: 44.61% | Val Loss: 1.3842, Val Acc: 50.04%
Epoch 2/15 | Train Loss: 1.1908, Train Acc: 57.72% | Val Loss: 1.1042, Val Acc: 61.23%
Epoch 3/15 | Train Loss: 1.0705, Train Acc: 62.56% | Val Loss: 1.0589, Val Acc: 62.85%
Epoch 4/15 | Train Loss: 0.9675, Train Acc: 66.00% | Val Loss: 0.8656, Val Acc: 69.88%
Epoch 5/15 | Train Loss: 0.8902, Train Acc: 68.84% | Val Loss: 0.8533, Val Acc: 69.80%
Epoch 6/15 | Train Loss: 0.8422, Train Acc: 70.63% | Val Loss: 0.7448, Val Acc: 73.83%
Epoch 7/15 | Train Loss: 0.7947, Train Acc: 72.56% | Val Loss: 0.8082, Val Acc: 72.43%
Epoch 8/15 | Train Loss: 0.7455, Train Acc: 73.94% | Val Loss: 0.7633, Val Acc: 73.54%
Epoch 9/15 | Train Loss: 0.7078, Train Acc: 75.57% | Val Loss: 0.6910, Val Acc: 76.16%
Epoch 10/15 | Train Loss: 0.6712, Train Acc: 76.87% | Val Loss: 0.6521, Val Acc: 77.27%
Epoch 11/15 | Train Loss: 0.6344, Train Acc: 77.94% | Val Loss: 0.6203, Val Acc: 78.21%
Epoch 12/15 | Train Loss: 0.6052, Train Acc: 78.98% | Val Loss: 0.6227, Val Acc: 78.06%
Epoch 13/15 | Train Loss: 0.5832, Train Acc: 79.70% | Val Loss: 0.5955, Val Acc: 78.96%
Epoch 14/15 | Train Loss: 0.5711, Train Acc: 80.47% | Val Loss: 0.5754, Val Acc: 79.94%
Epoch 15/15 | Train Loss: 0.5546, Train Acc: 80.93% | Val Loss: 0.5767, Val Acc: 79.79%
([1.512048579542838, 1.1908392604354703, 1.0704513658647952, 0.9675115708195036, 0.8902251476521992, 0.8421515398623084, 0.7946918207361265, 0.7455178170710268, 0.7078419516763419, 0.6712166821712728, 0.634432458435483, 0.6052003820686389, 0.5832140643121032, 0.5711386279224435, 0.5546144381203615], [44.612, 57.722, 62.56, 65.996, 68.836, 70.634, 72.562, 73.942, 75.566, 76.872, 77.938, 78.98, 79.696, 80.474, 80.926], [1.3841868261747723, 1.1042084618459773, 1.0589210270326348, 0.8656254976610595, 0.853308998331239, 0.744849595087993, 0.8082426763788054, 0.7632524982283387, 0.6909980739973769, 0.6520906713189958, 0.6203382151036323, 0.6226527996455566, 0.5955363478087172, 0.5753705584550206, 0.5767255289645135], [50.04, 61.23, 62.85, 69.88, 69.8, 73.83, 72.43, 73.54, 76.16, 77.27, 78.21, 78.06, 78.96, 79.94, 79.79])
RESULTS:
Epoch 1/15 | Train Loss: 1.5274, Train Acc: 44.15% | Val Loss: 1.2267, Val Acc: 55.65%
Epoch 2/15 | Train Loss: 1.2188, Train Acc: 56.37% | Val Loss: 1.0507, Val Acc: 62.21%
Epoch 3/15 | Train Loss: 1.0807, Train Acc: 61.58% | Val Loss: 0.9285, Val Acc: 67.59%
Epoch 4/15 | Train Loss: 0.9799, Train Acc: 65.62% | Val Loss: 0.9197, Val Acc: 67.58%
Epoch 5/15 | Train Loss: 0.9103, Train Acc: 68.24% | Val Loss: 0.8375, Val Acc: 70.20%
Epoch 6/15 | Train Loss: 0.8473, Train Acc: 70.48% | Val Loss: 0.8066, Val Acc: 72.24%
Epoch 7/15 | Train Loss: 0.7988, Train Acc: 72.28% | Val Loss: 0.7622, Val Acc: 73.45%
Epoch 8/15 | Train Loss: 0.7523, Train Acc: 73.98% | Val Loss: 0.7143, Val Acc: 75.20%
Epoch 9/15 | Train Loss: 0.7135, Train Acc: 75.29% | Val Loss: 0.6884, Val Acc: 76.28%
Epoch 10/15 | Train Loss: 0.6783, Train Acc: 76.52% | Val Loss: 0.6507, Val Acc: 77.45%
Epoch 11/15 | Train Loss: 0.6410, Train Acc: 77.89% | Val Loss: 0.6338, Val Acc: 78.17%
Epoch 12/15 | Train Loss: 0.6143, Train Acc: 78.80% | Val Loss: 0.6090, Val Acc: 79.02%
Epoch 13/15 | Train Loss: 0.5886, Train Acc: 79.60% | Val Loss: 0.6025, Val Acc: 79.41%
Epoch 14/15 | Train Loss: 0.5701, Train Acc: 80.37% | Val Loss: 0.5874, Val Acc: 79.74%
Epoch 15/15 | Train Loss: 0.5649, Train Acc: 80.48% | Val Loss: 0.5874, Val Acc: 79.68%
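One quick overfitting check is the gap between the final training and validation accuracies: a large gap suggests the model has memorised the training data rather than learned general features. Using the final-epoch figures from the run above:

```python
final_train_acc = 80.48  # final-epoch training accuracy (%) from the results above
final_val_acc = 79.68    # final-epoch validation accuracy (%)

gap = final_train_acc - final_val_acc
print(f"Generalisation gap: {gap:.2f} percentage points")
```

A gap of under one percentage point, as here, suggests the model is not badly overfitting; the dropout layer and data augmentation transforms both contribute to keeping it small.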
The following graphs show how the model's loss decreases and its accuracy increases as the number of epochs grows.
The following images are taken from the testing data and classified by the model. Each image is shown with the label the model assigns to it, demonstrating the model's ability to classify images.
After training the model, we can use it to classify images from the testing set. The following code plots each image along with the group the model identifies it as belonging to; this is the code used to display the images in the previous cell.
# Step 6: Visualising Predictions
def visualise_predictions(model, test_loader):
    classes = ('airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
    model.eval()  # Set model to evaluation mode
    images, labels = next(iter(test_loader))  # Get a batch of test images
    images, labels = images.to(device), labels.to(device)
    outputs = model(images)  # Forward pass to get predictions
    _, preds = torch.max(outputs, 1)  # Get predicted class indices
    fig, axes = plt.subplots(3, 3, figsize=(8, 8))  # Create a 3x3 grid for images
    axes = axes.flatten()
    for i in range(9):
        img = images[i].cpu().numpy().transpose((1, 2, 0))  # Convert image to NumPy and adjust dimensions
        img = (img * 0.5) + 0.5  # Unnormalise image for display
        axes[i].imshow(img)  # Display image
        axes[i].set_title(f'True: {classes[labels[i]]}\nPred: {classes[preds[i]]}')  # Set title with true and predicted labels
        axes[i].axis('off')  # Hide axes
    plt.show()  # Display the figure
visualise_predictions(model, test_loader) # Call visualisation function
# https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html
# https://matplotlib.org/stable/tutorials/introductory/pyplot.html
Within the next section of the tutorial, I will discuss how different hyperparameters can be altered to affect the model's ability to classify the images in the dataset. In this example, I will vary two hyperparameters: batch size and learning rate.

The batch size is the number of training samples processed before the model's weights are updated. A smaller batch size updates the model more frequently, which leads to faster learning and may help the model escape local minima, but can also cause fluctuations in training. The learning rate controls how much the model's weights change during training. A higher learning rate updates the weights more aggressively, which can lead to faster convergence but may overshoot the optimal values; a lower learning rate updates the weights slowly, reducing the risk of overshooting but increasing training time.

The following code trains the CNN model for 10 epochs while varying the learning rate and batch size. Through this, we can see how changing hyperparameters affects the validation accuracy of a model trained on the CIFAR-10 dataset.
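The cosine annealing scheduler used in these experiments smoothly decays the learning rate from its initial value towards zero, following eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T_max)). A plain-Python sketch of the schedule for lr = 0.01 and T_max = 10 (illustrative only, not a substitute for PyTorch's CosineAnnealingLR):

```python
import math

def cosine_annealed_lr(eta_max, t, t_max, eta_min=0.0):
    """Learning rate at step t under cosine annealing."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / t_max))

# Learning rate at each epoch, starting at 0.01 and decaying to 0 over 10 epochs.
schedule = [round(cosine_annealed_lr(0.01, t, 10), 5) for t in range(11)]
print(schedule)
```

The decay is slow at the start and end of training and fastest in the middle, which lets the model take large steps early on and settle into a minimum with small steps later.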
# Step 7: Hyperparameter Effects on Model Performance
def test_hyperparameters(learning_rates, batch_sizes, epochs=10):
    """Tests different hyperparameter settings and compares performance."""
    results = []
    for lr in learning_rates:
        for batch_size in batch_sizes:
            print(f"Training with Learning Rate: {lr}, Batch Size: {batch_size}")
            # Reload dataset with new batch size
            train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
            test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
            # Reinitialise model, optimiser, and scheduler
            model = CNN().to(device)
            optimiser = optim.SGD(model.parameters(), lr=lr, momentum=0.9, weight_decay=5e-4)
            scheduler = optim.lr_scheduler.CosineAnnealingLR(optimiser, T_max=epochs)
            # Train model
            for epoch in range(epochs):
                train_losses, train_accuracies, val_losses, val_accuracies = train_model(
                    model, train_loader, test_loader, criterion, optimiser, scheduler, num_epochs=1
                )
                # Print out results per epoch
                print(f"Epoch {epoch+1}/{epochs} | LR: {lr}, Batch Size: {batch_size} | Train Loss: {train_losses[-1]:.4f}, Train Accuracy: {train_accuracies[-1]:.2f}% | Validation Loss: {val_losses[-1]:.4f}, Validation Accuracy: {val_accuracies[-1]:.2f}%")
                print("------------------------------------------------------")
            # Store final results
            results.append({
                "learning_rate": lr,
                "batch_size": batch_size,
                "train_accuracy": train_accuracies[-1],
                "val_accuracy": val_accuracies[-1]
            })
    # Plot how hyperparameters affect accuracy
    fig, ax = plt.subplots(figsize=(8, 6))
    for res in results:
        ax.scatter(res["learning_rate"], res["val_accuracy"], label=f"Batch Size {res['batch_size']}", s=100)
    ax.set_xlabel("Learning Rate")
    ax.set_ylabel("Validation Accuracy (%)")
    ax.set_title("Impact of Learning Rate and Batch Size on Model Accuracy")
    ax.set_xscale("log")  # Log scale for better visualisation of learning rate differences
    ax.set_ylim(0, 100)  # Ensure accuracy is displayed between 0-100%
    ax.legend()
    plt.show()
# Run hyperparameter tests
test_hyperparameters(learning_rates=[0.1, 0.01, 0.001], batch_sizes=[32, 64, 128], epochs=10)
# https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/
# https://towardsdatascience.com/understanding-learning-rates-and-how-it-impacts-your-deep-learning-model-1cd6b49197c3
# https://matplotlib.org/stable/tutorials/introductory/pyplot.html
Training with Learning Rate: 0.1, Batch Size: 32
Epoch 1/10 | LR: 0.1, Batch Size: 32 | Train Loss: 2.2083, Train Accuracy: 17.16% | Validation Loss: 1.9620, Validation Accuracy: 26.60%
Epoch 2/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.9482, Train Accuracy: 26.12% | Validation Loss: 1.8579, Validation Accuracy: 28.70%
Epoch 3/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.8886, Train Accuracy: 29.68% | Validation Loss: 1.9356, Validation Accuracy: 29.13%
Epoch 4/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.8107, Train Accuracy: 33.11% | Validation Loss: 1.5736, Validation Accuracy: 40.82%
Epoch 5/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.7205, Train Accuracy: 36.64% | Validation Loss: 1.6229, Validation Accuracy: 43.41%
Epoch 6/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.6158, Train Accuracy: 40.58% | Validation Loss: 1.5459, Validation Accuracy: 43.22%
Epoch 7/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.5247, Train Accuracy: 44.28% | Validation Loss: 1.5468, Validation Accuracy: 44.28%
Epoch 8/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.4231, Train Accuracy: 48.26% | Validation Loss: 1.2664, Validation Accuracy: 54.40%
Epoch 9/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.3183, Train Accuracy: 52.49% | Validation Loss: 1.2189, Validation Accuracy: 55.37%
Epoch 10/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.2222, Train Accuracy: 56.35% | Validation Loss: 1.0926, Validation Accuracy: 60.98%
Training with Learning Rate: 0.1, Batch Size: 64
Epoch 1/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3233, Train Accuracy: 10.01% | Validation Loss: 2.3047, Validation Accuracy: 10.00%
Epoch 2/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3061, Train Accuracy: 9.93% | Validation Loss: 2.3056, Validation Accuracy: 10.00%
Epoch 3/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3059, Train Accuracy: 10.05% | Validation Loss: 2.3040, Validation Accuracy: 10.00%
Epoch 4/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3054, Train Accuracy: 9.97% | Validation Loss: 2.3035, Validation Accuracy: 10.00%
Epoch 5/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3052, Train Accuracy: 10.02% | Validation Loss: 2.3037, Validation Accuracy: 10.00%
Epoch 6/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3045, Train Accuracy: 9.97% | Validation Loss: 2.3053, Validation Accuracy: 10.00%
Epoch 7/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3041, Train Accuracy: 9.81% | Validation Loss: 2.3046, Validation Accuracy: 10.00%
Epoch 8/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3034, Train Accuracy: 9.94% | Validation Loss: 2.3029, Validation Accuracy: 10.00%
Epoch 9/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3030, Train Accuracy: 9.89% | Validation Loss: 2.3027, Validation Accuracy: 10.00%
Epoch 10/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3027, Train Accuracy: 9.97% | Validation Loss: 2.3026, Validation Accuracy: 10.00%
Training with Learning Rate: 0.1, Batch Size: 128
Epoch 1/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.2797, Train Accuracy: 12.17% | Validation Loss: 2.1170, Validation Accuracy: 19.18%
Epoch 2/10 | LR: 0.1, Batch Size: 128 | Train Loss: 1.9920, Train Accuracy: 23.57% | Validation Loss: 1.6921, Validation Accuracy: 35.13%
Epoch 3/10 | LR: 0.1, Batch Size: 128 | Train Loss: 1.5737, Train Accuracy: 42.19% | Validation Loss: 1.3433, Validation Accuracy: 49.36%
Epoch 4/10 | LR: 0.1, Batch Size: 128 | Train Loss: 1.3403, Train Accuracy: 51.98% | Validation Loss: 1.1466, Validation Accuracy: 58.92%
Epoch 5/10 | LR: 0.1, Batch Size: 128 | Train Loss: 1.1782, Train Accuracy: 58.48% | Validation Loss: 0.9989, Validation Accuracy: 65.70%
Epoch 6/10 | LR: 0.1, Batch Size: 128 | Train Loss: 1.0646, Train Accuracy: 63.15% | Validation Loss: 0.9514, Validation Accuracy: 67.54%
Epoch 7/10 | LR: 0.1, Batch Size: 128 | Train Loss: 0.9479, Train Accuracy: 67.28% | Validation Loss: 0.8156, Validation Accuracy: 72.59%
Epoch 8/10 | LR: 0.1, Batch Size: 128 | Train Loss: 0.8681, Train Accuracy: 70.15% | Validation Loss: 0.7834, Validation Accuracy: 73.27%
Epoch 9/10 | LR: 0.1, Batch Size: 128 | Train Loss: 0.7948, Train Accuracy: 72.61% | Validation Loss: 0.6873, Validation Accuracy: 76.98%
Epoch 10/10 | LR: 0.1, Batch Size: 128 | Train Loss: 0.7449, Train Accuracy: 74.48% | Validation Loss: 0.6544, Validation Accuracy: 77.49%
Training with Learning Rate: 0.01, Batch Size: 32
Epoch 1/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.9135, Train Accuracy: 27.37% | Validation Loss: 1.6193, Validation Accuracy: 40.80%
Epoch 2/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.5666, Train Accuracy: 43.11% | Validation Loss: 1.4225, Validation Accuracy: 49.15%
Epoch 3/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.2889, Train Accuracy: 54.09% | Validation Loss: 1.1556, Validation Accuracy: 60.02%
Epoch 4/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.0896, Train Accuracy: 61.99% | Validation Loss: 0.9115, Validation Accuracy: 67.87%
Epoch 5/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.9627, Train Accuracy: 66.80% | Validation Loss: 0.8325, Validation Accuracy: 71.65%
Epoch 6/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.8711, Train Accuracy: 70.10% | Validation Loss: 0.7660, Validation Accuracy: 73.21%
Epoch 7/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.7935, Train Accuracy: 72.78% | Validation Loss: 0.7098, Validation Accuracy: 75.70%
Epoch 8/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.7240, Train Accuracy: 75.24% | Validation Loss: 0.6794, Validation Accuracy: 76.89%
Epoch 9/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.6797, Train Accuracy: 76.85% | Validation Loss: 0.6325, Validation Accuracy: 78.49%
Epoch 10/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.6451, Train Accuracy: 78.07% | Validation Loss: 0.6081, Validation Accuracy: 78.71%
Training with Learning Rate: 0.01, Batch Size: 64
Epoch 1/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.6292, Train Accuracy: 39.98% | Validation Loss: 1.4117, Validation Accuracy: 50.59%
Epoch 1/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.6292, Train Accuracy: 39.98% | Validation Loss: 1.4117, Validation Accuracy: 50.59% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.3544, Train Acc: 51.60% | Val Loss: 1.2309, Val Acc: 57.31%
Epoch 2/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.3544, Train Accuracy: 51.60% | Validation Loss: 1.2309, Validation Accuracy: 57.31% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.1870, Train Acc: 58.34% | Val Loss: 1.0298, Val Acc: 63.64%
Epoch 3/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.1870, Train Accuracy: 58.34% | Validation Loss: 1.0298, Validation Accuracy: 63.64% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.0676, Train Acc: 62.89% | Val Loss: 0.9236, Val Acc: 68.13%
Epoch 4/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.0676, Train Accuracy: 62.89% | Validation Loss: 0.9236, Validation Accuracy: 68.13% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9725, Train Acc: 66.40% | Val Loss: 0.8506, Val Acc: 71.13%
Epoch 5/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.9725, Train Accuracy: 66.40% | Validation Loss: 0.8506, Validation Accuracy: 71.13% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8878, Train Acc: 69.58% | Val Loss: 0.8051, Val Acc: 71.89%
Epoch 6/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.8878, Train Accuracy: 69.58% | Validation Loss: 0.8051, Validation Accuracy: 71.89% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8276, Train Acc: 71.38% | Val Loss: 0.7414, Val Acc: 74.61%
Epoch 7/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.8276, Train Accuracy: 71.38% | Validation Loss: 0.7414, Validation Accuracy: 74.61% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7701, Train Acc: 73.65% | Val Loss: 0.6885, Val Acc: 76.11%
Epoch 8/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.7701, Train Accuracy: 73.65% | Validation Loss: 0.6885, Validation Accuracy: 76.11% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7240, Train Acc: 75.30% | Val Loss: 0.6541, Val Acc: 77.33%
Epoch 9/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.7240, Train Accuracy: 75.30% | Validation Loss: 0.6541, Validation Accuracy: 77.33% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7005, Train Acc: 76.20% | Val Loss: 0.6409, Val Acc: 77.50%
Epoch 10/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.7005, Train Accuracy: 76.20% | Validation Loss: 0.6409, Validation Accuracy: 77.50% ------------------------------------------------------ Training with Learning Rate: 0.01, Batch Size: 128 Epoch 1/1 | Train Loss: 1.4956, Train Acc: 45.32% | Val Loss: 1.2078, Val Acc: 56.14%
Epoch 1/10 | LR: 0.01, Batch Size: 128 | Train Loss: 1.4956, Train Accuracy: 45.32% | Validation Loss: 1.2078, Validation Accuracy: 56.14% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.1907, Train Acc: 57.72% | Val Loss: 1.0167, Val Acc: 63.59%
Epoch 2/10 | LR: 0.01, Batch Size: 128 | Train Loss: 1.1907, Train Accuracy: 57.72% | Validation Loss: 1.0167, Validation Accuracy: 63.59% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.0544, Train Acc: 62.76% | Val Loss: 0.9443, Val Acc: 67.29%
Epoch 3/10 | LR: 0.01, Batch Size: 128 | Train Loss: 1.0544, Train Accuracy: 62.76% | Validation Loss: 0.9443, Validation Accuracy: 67.29% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9595, Train Acc: 66.21% | Val Loss: 0.8442, Val Acc: 70.42%
Epoch 4/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.9595, Train Accuracy: 66.21% | Validation Loss: 0.8442, Validation Accuracy: 70.42% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8757, Train Acc: 69.55% | Val Loss: 0.8423, Val Acc: 70.95%
Epoch 5/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.8757, Train Accuracy: 69.55% | Validation Loss: 0.8423, Validation Accuracy: 70.95% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8029, Train Acc: 72.11% | Val Loss: 0.7375, Val Acc: 74.29%
Epoch 6/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.8029, Train Accuracy: 72.11% | Validation Loss: 0.7375, Validation Accuracy: 74.29% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7496, Train Acc: 74.00% | Val Loss: 0.7292, Val Acc: 74.92%
Epoch 7/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.7496, Train Accuracy: 74.00% | Validation Loss: 0.7292, Validation Accuracy: 74.92% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7048, Train Acc: 75.47% | Val Loss: 0.6711, Val Acc: 76.66%
Epoch 8/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.7048, Train Accuracy: 75.47% | Validation Loss: 0.6711, Validation Accuracy: 76.66% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.6664, Train Acc: 76.95% | Val Loss: 0.6476, Val Acc: 77.50%
Epoch 9/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.6664, Train Accuracy: 76.95% | Validation Loss: 0.6476, Validation Accuracy: 77.50% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.6437, Train Acc: 77.59% | Val Loss: 0.6319, Val Acc: 78.13%
Epoch 10/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.6437, Train Accuracy: 77.59% | Validation Loss: 0.6319, Validation Accuracy: 78.13% ------------------------------------------------------ Training with Learning Rate: 0.001, Batch Size: 32 Epoch 1/1 | Train Loss: 1.4927, Train Acc: 45.64% | Val Loss: 1.1625, Val Acc: 58.54%
Epoch 1/10 | LR: 0.001, Batch Size: 32 | Train Loss: 1.4927, Train Accuracy: 45.64% | Validation Loss: 1.1625, Validation Accuracy: 58.54% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.1540, Train Acc: 58.87% | Val Loss: 1.0495, Val Acc: 62.50%
Epoch 2/10 | LR: 0.001, Batch Size: 32 | Train Loss: 1.1540, Train Accuracy: 58.87% | Validation Loss: 1.0495, Validation Accuracy: 62.50% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.0157, Train Acc: 64.20% | Val Loss: 0.8907, Val Acc: 69.17%
Epoch 3/10 | LR: 0.001, Batch Size: 32 | Train Loss: 1.0157, Train Accuracy: 64.20% | Validation Loss: 0.8907, Validation Accuracy: 69.17% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9259, Train Acc: 67.54% | Val Loss: 0.8672, Val Acc: 69.50%
Epoch 4/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.9259, Train Accuracy: 67.54% | Validation Loss: 0.8672, Validation Accuracy: 69.50% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8544, Train Acc: 70.23% | Val Loss: 0.8120, Val Acc: 71.80%
Epoch 5/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.8544, Train Accuracy: 70.23% | Validation Loss: 0.8120, Validation Accuracy: 71.80% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8024, Train Acc: 71.94% | Val Loss: 0.7894, Val Acc: 72.24%
Epoch 6/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.8024, Train Accuracy: 71.94% | Validation Loss: 0.7894, Validation Accuracy: 72.24% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7540, Train Acc: 73.71% | Val Loss: 0.7252, Val Acc: 74.93%
Epoch 7/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.7540, Train Accuracy: 73.71% | Validation Loss: 0.7252, Validation Accuracy: 74.93% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7202, Train Acc: 74.85% | Val Loss: 0.7014, Val Acc: 75.58%
Epoch 8/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.7202, Train Accuracy: 74.85% | Validation Loss: 0.7014, Validation Accuracy: 75.58% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.6941, Train Acc: 75.87% | Val Loss: 0.6888, Val Acc: 75.86%
Epoch 9/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.6941, Train Accuracy: 75.87% | Validation Loss: 0.6888, Validation Accuracy: 75.86% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.6774, Train Acc: 76.41% | Val Loss: 0.6684, Val Acc: 76.85%
Epoch 10/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.6774, Train Accuracy: 76.41% | Validation Loss: 0.6684, Validation Accuracy: 76.85% ------------------------------------------------------ Training with Learning Rate: 0.001, Batch Size: 64 Epoch 1/1 | Train Loss: 1.5638, Train Acc: 43.62% | Val Loss: 1.2799, Val Acc: 53.76%
Epoch 1/10 | LR: 0.001, Batch Size: 64 | Train Loss: 1.5638, Train Accuracy: 43.62% | Validation Loss: 1.2799, Validation Accuracy: 53.76% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.2205, Train Acc: 56.26% | Val Loss: 1.1183, Val Acc: 59.98%
Epoch 2/10 | LR: 0.001, Batch Size: 64 | Train Loss: 1.2205, Train Accuracy: 56.26% | Validation Loss: 1.1183, Validation Accuracy: 59.98% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.0814, Train Acc: 61.66% | Val Loss: 0.9655, Val Acc: 65.72%
Epoch 3/10 | LR: 0.001, Batch Size: 64 | Train Loss: 1.0814, Train Accuracy: 61.66% | Validation Loss: 0.9655, Validation Accuracy: 65.72% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9962, Train Acc: 65.00% | Val Loss: 0.9197, Val Acc: 67.98%
Epoch 4/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.9962, Train Accuracy: 65.00% | Validation Loss: 0.9197, Validation Accuracy: 67.98% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9255, Train Acc: 67.42% | Val Loss: 0.8587, Val Acc: 70.65%
Epoch 5/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.9255, Train Accuracy: 67.42% | Validation Loss: 0.8587, Validation Accuracy: 70.65% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8738, Train Acc: 69.28% | Val Loss: 0.8268, Val Acc: 71.16%
Epoch 6/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.8738, Train Accuracy: 69.28% | Validation Loss: 0.8268, Validation Accuracy: 71.16% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8350, Train Acc: 70.78% | Val Loss: 0.8207, Val Acc: 72.01%
Epoch 7/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.8350, Train Accuracy: 70.78% | Validation Loss: 0.8207, Validation Accuracy: 72.01% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.8061, Train Acc: 71.83% | Val Loss: 0.7845, Val Acc: 73.12%
Epoch 8/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.8061, Train Accuracy: 71.83% | Validation Loss: 0.7845, Validation Accuracy: 73.12% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7811, Train Acc: 72.74% | Val Loss: 0.7645, Val Acc: 73.57%
Epoch 9/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.7811, Train Accuracy: 72.74% | Validation Loss: 0.7645, Validation Accuracy: 73.57% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.7669, Train Acc: 73.57% | Val Loss: 0.7571, Val Acc: 73.72%
Epoch 10/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.7669, Train Accuracy: 73.57% | Validation Loss: 0.7571, Validation Accuracy: 73.72% ------------------------------------------------------ Training with Learning Rate: 0.001, Batch Size: 128 Epoch 1/1 | Train Loss: 1.6948, Train Acc: 38.78% | Val Loss: 1.3722, Val Acc: 52.02%
Epoch 1/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.6948, Train Accuracy: 38.78% | Validation Loss: 1.3722, Validation Accuracy: 52.02% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.3424, Train Acc: 51.65% | Val Loss: 1.1973, Val Acc: 58.22%
Epoch 2/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.3424, Train Accuracy: 51.65% | Validation Loss: 1.1973, Validation Accuracy: 58.22% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.1941, Train Acc: 57.52% | Val Loss: 1.0809, Val Acc: 62.14%
Epoch 3/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.1941, Train Accuracy: 57.52% | Validation Loss: 1.0809, Validation Accuracy: 62.14% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.1007, Train Acc: 61.19% | Val Loss: 1.0312, Val Acc: 63.77%
Epoch 4/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.1007, Train Accuracy: 61.19% | Validation Loss: 1.0312, Validation Accuracy: 63.77% ------------------------------------------------------ Epoch 1/1 | Train Loss: 1.0322, Train Acc: 63.61% | Val Loss: 0.9862, Val Acc: 65.21%
Epoch 5/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.0322, Train Accuracy: 63.61% | Validation Loss: 0.9862, Validation Accuracy: 65.21% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9889, Train Acc: 65.45% | Val Loss: 0.9437, Val Acc: 67.73%
Epoch 6/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9889, Train Accuracy: 65.45% | Validation Loss: 0.9437, Validation Accuracy: 67.73% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9509, Train Acc: 66.82% | Val Loss: 0.9251, Val Acc: 67.27%
Epoch 7/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9509, Train Accuracy: 66.82% | Validation Loss: 0.9251, Validation Accuracy: 67.27% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9307, Train Acc: 67.60% | Val Loss: 0.8933, Val Acc: 68.94%
Epoch 8/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9307, Train Accuracy: 67.60% | Validation Loss: 0.8933, Validation Accuracy: 68.94% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9124, Train Acc: 68.17% | Val Loss: 0.8845, Val Acc: 69.33%
Epoch 9/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9124, Train Accuracy: 68.17% | Validation Loss: 0.8845, Validation Accuracy: 69.33% ------------------------------------------------------ Epoch 1/1 | Train Loss: 0.9043, Train Acc: 68.59% | Val Loss: 0.8700, Val Acc: 69.64%
Epoch 10/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9043, Train Accuracy: 68.59% | Validation Loss: 0.8700, Validation Accuracy: 69.64% ------------------------------------------------------
RESULTS OF CHANGING HYPERPARAMETERS:
Training with Learning Rate: 0.1, Batch Size: 32
Epoch 1/10 | LR: 0.1, Batch Size: 32 | Train Loss: 2.3324, Train Accuracy: 10.08% | Validation Loss: 2.3081, Validation Accuracy: 10.00%
Epoch 2/10 | LR: 0.1, Batch Size: 32 | Train Loss: 2.2985, Train Accuracy: 10.73% | Validation Loss: 2.1694, Validation Accuracy: 16.61%
Epoch 3/10 | LR: 0.1, Batch Size: 32 | Train Loss: 2.0229, Train Accuracy: 23.04% | Validation Loss: 1.8731, Validation Accuracy: 32.89%
Epoch 4/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.7961, Train Accuracy: 33.93% | Validation Loss: 1.5531, Validation Accuracy: 42.11%
Epoch 5/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.6591, Train Accuracy: 39.38% | Validation Loss: 1.6598, Validation Accuracy: 42.60%
Epoch 6/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.5520, Train Accuracy: 43.71% | Validation Loss: 1.4601, Validation Accuracy: 46.39%
Epoch 7/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.4454, Train Accuracy: 48.24% | Validation Loss: 1.3077, Validation Accuracy: 53.02%
Epoch 8/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.3298, Train Accuracy: 52.58% | Validation Loss: 1.1986, Validation Accuracy: 57.32%
Epoch 9/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.2043, Train Accuracy: 57.18% | Validation Loss: 1.0654, Validation Accuracy: 62.05%
Epoch 10/10 | LR: 0.1, Batch Size: 32 | Train Loss: 1.1123, Train Accuracy: 60.42% | Validation Loss: 1.0096, Validation Accuracy: 64.23%
Training with Learning Rate: 0.1, Batch Size: 64
Epoch 1/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3241, Train Accuracy: 10.11% | Validation Loss: 2.3065, Validation Accuracy: 10.00%
Epoch 2/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3059, Train Accuracy: 10.06% | Validation Loss: 2.3066, Validation Accuracy: 10.00%
Epoch 3/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3059, Train Accuracy: 9.93% | Validation Loss: 2.3087, Validation Accuracy: 10.00%
Epoch 4/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3057, Train Accuracy: 10.02% | Validation Loss: 2.3084, Validation Accuracy: 10.00%
Epoch 5/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3055, Train Accuracy: 9.92% | Validation Loss: 2.3044, Validation Accuracy: 10.00%
Epoch 6/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3041, Train Accuracy: 9.96% | Validation Loss: 2.3046, Validation Accuracy: 10.00%
Epoch 7/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3039, Train Accuracy: 9.93% | Validation Loss: 2.3032, Validation Accuracy: 10.00%
Epoch 8/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.3032, Train Accuracy: 10.23% | Validation Loss: 2.3033, Validation Accuracy: 10.00%
Epoch 9/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.2637, Train Accuracy: 11.80% | Validation Loss: 2.1033, Validation Accuracy: 16.60%
Epoch 10/10 | LR: 0.1, Batch Size: 64 | Train Loss: 2.1459, Train Accuracy: 15.89% | Validation Loss: 2.0479, Validation Accuracy: 20.01%
Training with Learning Rate: 0.1, Batch Size: 128
Epoch 1/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.2965, Train Accuracy: 11.00% | Validation Loss: 2.3040, Validation Accuracy: 10.00%
Epoch 2/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3044, Train Accuracy: 10.08% | Validation Loss: 2.3037, Validation Accuracy: 10.00%
Epoch 3/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3043, Train Accuracy: 9.98% | Validation Loss: 2.3041, Validation Accuracy: 10.00%
Epoch 4/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3040, Train Accuracy: 9.99% | Validation Loss: 2.3034, Validation Accuracy: 10.00%
Epoch 5/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3037, Train Accuracy: 10.04% | Validation Loss: 2.3031, Validation Accuracy: 10.00%
Epoch 6/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3036, Train Accuracy: 9.69% | Validation Loss: 2.3031, Validation Accuracy: 10.00%
Epoch 7/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3033, Train Accuracy: 9.96% | Validation Loss: 2.3032, Validation Accuracy: 10.00%
Epoch 8/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3030, Train Accuracy: 9.81% | Validation Loss: 2.3028, Validation Accuracy: 10.00%
Epoch 9/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3028, Train Accuracy: 9.84% | Validation Loss: 2.3027, Validation Accuracy: 10.00%
Epoch 10/10 | LR: 0.1, Batch Size: 128 | Train Loss: 2.3027, Train Accuracy: 9.91% | Validation Loss: 2.3026, Validation Accuracy: 10.00%
Training with Learning Rate: 0.01, Batch Size: 32
Epoch 1/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.8915, Train Accuracy: 29.28% | Validation Loss: 1.5282, Validation Accuracy: 44.43%
Epoch 2/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.5794, Train Accuracy: 42.68% | Validation Loss: 1.2870, Validation Accuracy: 52.23%
Epoch 3/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.2906, Train Accuracy: 54.25% | Validation Loss: 1.0769, Validation Accuracy: 62.43%
Epoch 4/10 | LR: 0.01, Batch Size: 32 | Train Loss: 1.1013, Train Accuracy: 61.50% | Validation Loss: 0.8850, Validation Accuracy: 69.03%
Epoch 5/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.9893, Train Accuracy: 66.03% | Validation Loss: 0.9128, Validation Accuracy: 68.90%
Epoch 6/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.8962, Train Accuracy: 69.31% | Validation Loss: 0.7750, Validation Accuracy: 73.52%
Epoch 7/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.8210, Train Accuracy: 71.93% | Validation Loss: 0.7520, Validation Accuracy: 74.31%
Epoch 8/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.7602, Train Accuracy: 74.19% | Validation Loss: 0.6925, Validation Accuracy: 76.55%
Epoch 9/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.7084, Train Accuracy: 75.96% | Validation Loss: 0.6407, Validation Accuracy: 77.94%
Epoch 10/10 | LR: 0.01, Batch Size: 32 | Train Loss: 0.6745, Train Accuracy: 77.18% | Validation Loss: 0.6214, Validation Accuracy: 78.36%
Training with Learning Rate: 0.01, Batch Size: 64
Epoch 1/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.6201, Train Accuracy: 40.59% | Validation Loss: 1.3318, Validation Accuracy: 51.17%
Epoch 2/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.3384, Train Accuracy: 52.57% | Validation Loss: 1.0960, Validation Accuracy: 62.41%
Epoch 3/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.1787, Train Accuracy: 58.75% | Validation Loss: 1.0231, Validation Accuracy: 64.45%
Epoch 4/10 | LR: 0.01, Batch Size: 64 | Train Loss: 1.0723, Train Accuracy: 62.80% | Validation Loss: 0.9541, Validation Accuracy: 66.50%
Epoch 5/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.9719, Train Accuracy: 66.41% | Validation Loss: 0.8621, Validation Accuracy: 70.54%
Epoch 6/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.8851, Train Accuracy: 69.59% | Validation Loss: 0.8215, Validation Accuracy: 72.38%
Epoch 7/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.8271, Train Accuracy: 71.59% | Validation Loss: 0.7494, Validation Accuracy: 74.60%
Epoch 8/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.7678, Train Accuracy: 73.64% | Validation Loss: 0.7037, Validation Accuracy: 75.75%
Epoch 9/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.7238, Train Accuracy: 75.18% | Validation Loss: 0.6538, Validation Accuracy: 77.38%
Epoch 10/10 | LR: 0.01, Batch Size: 64 | Train Loss: 0.6994, Train Accuracy: 76.17% | Validation Loss: 0.6475, Validation Accuracy: 77.86%
Training with Learning Rate: 0.01, Batch Size: 128
Epoch 1/10 | LR: 0.01, Batch Size: 128 | Train Loss: 1.5120, Train Accuracy: 44.51% | Validation Loss: 1.2368, Validation Accuracy: 54.91%
Epoch 2/10 | LR: 0.01, Batch Size: 128 | Train Loss: 1.1981, Train Accuracy: 57.36% | Validation Loss: 0.9969, Validation Accuracy: 64.50%
Epoch 3/10 | LR: 0.01, Batch Size: 128 | Train Loss: 1.0573, Train Accuracy: 62.72% | Validation Loss: 1.0177, Validation Accuracy: 64.60%
Epoch 4/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.9633, Train Accuracy: 66.25% | Validation Loss: 0.8940, Validation Accuracy: 68.65%
Epoch 5/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.8791, Train Accuracy: 69.30% | Validation Loss: 0.8151, Validation Accuracy: 71.83%
Epoch 6/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.8074, Train Accuracy: 71.67% | Validation Loss: 0.7947, Validation Accuracy: 72.46%
Epoch 7/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.7526, Train Accuracy: 73.92% | Validation Loss: 0.7104, Validation Accuracy: 75.38%
Epoch 8/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.7015, Train Accuracy: 75.70% | Validation Loss: 0.6922, Validation Accuracy: 76.22%
Epoch 9/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.6703, Train Accuracy: 76.65% | Validation Loss: 0.6507, Validation Accuracy: 77.72%
Epoch 10/10 | LR: 0.01, Batch Size: 128 | Train Loss: 0.6466, Train Accuracy: 77.70% | Validation Loss: 0.6349, Validation Accuracy: 77.96%
Training with Learning Rate: 0.001, Batch Size: 32
Epoch 1/10 | LR: 0.001, Batch Size: 32 | Train Loss: 1.4993, Train Accuracy: 45.57% | Validation Loss: 1.1905, Validation Accuracy: 57.53%
Epoch 2/10 | LR: 0.001, Batch Size: 32 | Train Loss: 1.1624, Train Accuracy: 58.47% | Validation Loss: 1.0042, Validation Accuracy: 64.78%
Epoch 3/10 | LR: 0.001, Batch Size: 32 | Train Loss: 1.0282, Train Accuracy: 63.68% | Validation Loss: 0.9523, Validation Accuracy: 65.82%
Epoch 4/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.9354, Train Accuracy: 67.09% | Validation Loss: 0.8396, Validation Accuracy: 70.59%
Epoch 5/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.8635, Train Accuracy: 69.73% | Validation Loss: 0.8283, Validation Accuracy: 71.30%
Epoch 6/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.8086, Train Accuracy: 71.74% | Validation Loss: 0.7482, Validation Accuracy: 73.76%
Epoch 7/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.7604, Train Accuracy: 73.42% | Validation Loss: 0.7644, Validation Accuracy: 73.49%
Epoch 8/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.7225, Train Accuracy: 75.00% | Validation Loss: 0.6892, Validation Accuracy: 75.97%
Epoch 9/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.6973, Train Accuracy: 75.66% | Validation Loss: 0.6887, Validation Accuracy: 75.96%
Epoch 10/10 | LR: 0.001, Batch Size: 32 | Train Loss: 0.6836, Train Accuracy: 76.48% | Validation Loss: 0.6716, Validation Accuracy: 76.76%
Training with Learning Rate: 0.001, Batch Size: 64
Epoch 1/10 | LR: 0.001, Batch Size: 64 | Train Loss: 1.5746, Train Accuracy: 42.81% | Validation Loss: 1.2964, Validation Accuracy: 53.31%
Epoch 2/10 | LR: 0.001, Batch Size: 64 | Train Loss: 1.2207, Train Accuracy: 56.48% | Validation Loss: 1.1271, Validation Accuracy: 60.63%
Epoch 3/10 | LR: 0.001, Batch Size: 64 | Train Loss: 1.0722, Train Accuracy: 62.09% | Validation Loss: 1.0046, Validation Accuracy: 64.35%
Epoch 4/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.9835, Train Accuracy: 65.41% | Validation Loss: 0.9519, Validation Accuracy: 66.81%
Epoch 5/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.9198, Train Accuracy: 67.71% | Validation Loss: 0.8874, Validation Accuracy: 68.40%
Epoch 6/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.8693, Train Accuracy: 69.62% | Validation Loss: 0.8449, Validation Accuracy: 70.72%
Epoch 7/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.8249, Train Accuracy: 71.18% | Validation Loss: 0.8041, Validation Accuracy: 71.64%
Epoch 8/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.7967, Train Accuracy: 72.31% | Validation Loss: 0.7758, Validation Accuracy: 72.98%
Epoch 9/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.7728, Train Accuracy: 73.16% | Validation Loss: 0.7602, Validation Accuracy: 73.56%
Epoch 10/10 | LR: 0.001, Batch Size: 64 | Train Loss: 0.7607, Train Accuracy: 73.73% | Validation Loss: 0.7523, Validation Accuracy: 73.85%
Training with Learning Rate: 0.001, Batch Size: 128
Epoch 1/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.6969, Train Accuracy: 38.63% | Validation Loss: 1.3976, Validation Accuracy: 50.28%
Epoch 2/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.3413, Train Accuracy: 52.17% | Validation Loss: 1.2072, Validation Accuracy: 57.36%
Epoch 3/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.1986, Train Accuracy: 57.50% | Validation Loss: 1.1069, Validation Accuracy: 61.18%
Epoch 4/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.0970, Train Accuracy: 61.35% | Validation Loss: 1.0114, Validation Accuracy: 64.73%
Epoch 5/10 | LR: 0.001, Batch Size: 128 | Train Loss: 1.0321, Train Accuracy: 63.78% | Validation Loss: 0.9865, Validation Accuracy: 65.34%
Epoch 6/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9832, Train Accuracy: 65.44% | Validation Loss: 0.9335, Validation Accuracy: 67.72%
Epoch 7/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9505, Train Accuracy: 66.83% | Validation Loss: 0.9144, Validation Accuracy: 68.31%
Epoch 8/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9273, Train Accuracy: 67.71% | Validation Loss: 0.8854, Validation Accuracy: 69.35%
Epoch 9/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.9060, Train Accuracy: 68.28% | Validation Loss: 0.8712, Validation Accuracy: 70.30%
Epoch 10/10 | LR: 0.001, Batch Size: 128 | Train Loss: 0.8949, Train Accuracy: 69.05% | Validation Loss: 0.8738, Validation Accuracy: 70.05%
What do these results tell us about how changing hyperparameters affects the performance of an AI model?
Learning Rate:
High Learning Rate (0.1)
Batch Size 32 shows steady improvement, reaching 64.23% accuracy. Batch Sizes 64 and 128 struggle to improve, remaining stuck near 10% accuracy (equivalent to random guessing across 10 classes). A high learning rate can cause unstable updates that prevent the model from converging, especially with larger batches.
Moderate Learning Rate (0.01)
All batch sizes show consistent learning, with validation accuracy improving to ~78%. This rate allows for stable updates while still converging efficiently.
Low Learning Rate (0.001)
All batch sizes also show steady learning, but with slightly lower final accuracy (~76-77%). A lower learning rate results in slower but more stable convergence.
Batch Size:
Smaller Batch Size (32)
Reaches the highest validation accuracy (78.36%) of the batch sizes tested. The more frequent, noisier weight updates appear to aid generalisation.
Larger Batch Sizes (64, 128)
Improve more slowly, but still reach ~77% accuracy. Larger batches produce smoother gradient estimates, but may converge to suboptimal solutions.
Conclusion
LR = 0.01 with batch size 32 gives the best results. A high LR (0.1) fails with large batches, while a lower LR (0.001) stabilises training but learns slowly.
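The sweep that produced the logs above can be sketched as a pair of nested loops over learning rates and batch sizes. This is a minimal, self-contained version: synthetic tensors stand in for CIFAR-10 so it runs without downloads, a tiny linear model stands in for the CNN defined earlier, and only two epochs are run per configuration rather than ten.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X = torch.randn(512, 3 * 32 * 32)           # stand-in for flattened CIFAR-10 images
y = torch.randint(0, 10, (512,))            # stand-in labels for 10 classes
train_set = TensorDataset(X[:400], y[:400])
val_set = TensorDataset(X[400:], y[400:])

results = {}
for lr in [0.1, 0.01, 0.001]:
    for batch_size in [32, 64, 128]:
        train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
        val_loader = DataLoader(val_set, batch_size=batch_size)
        model = nn.Linear(3 * 32 * 32, 10)  # placeholder for the CNN
        criterion = nn.CrossEntropyLoss()
        optimizer = optim.SGD(model.parameters(), lr=lr)
        for epoch in range(2):              # 10 epochs in the real sweep
            model.train()
            for inputs, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(inputs), labels)
                loss.backward()
                optimizer.step()
        # Evaluate the final validation accuracy for this configuration
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                preds = model(inputs).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.size(0)
        results[(lr, batch_size)] = 100.0 * correct / total

best = max(results, key=results.get)
print(f"Best config: LR={best[0]}, Batch={best[1]} "
      f"({results[best]:.2f}% val accuracy)")
```

With random data the reported accuracies are meaningless; the point is the sweep structure, where every (learning rate, batch size) pair gets a freshly initialised model so configurations are compared fairly.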
How does this tutorial differ from other similar tutorials?
This tutorial differs from the other tutorials that I have referenced by demonstrating how different hyperparameters affect the ability of the model to make accurate predictions. One source that I followed throughout the development of my pipeline is https://pyimagesearch.com/2021/07/19/pytorch-training-your-first-convolutional-neural-network-cnn/. An advantage of my tutorial over that one is this explicit demonstration of how a neural network is affected by different hyperparameters. A disadvantage is the amount of code needed to build the pipeline: the PyImageSearch tutorial uses significantly less Python code than mine. Another advantage of my tutorial is the implementation of a learning-rate scheduler, which the PyImageSearch tutorial does not feature. A scheduler adjusts the learning rate during training, helping the model converge to weights that generalise well. Without one, training is more prone to overfitting, leaving the model accurate only on the data present in the training set.
Another tutorial that I referenced during the creation of this tutorial is a Keras-based guide to how hyperparameters affect an AI model's ability to predict validation data: https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/. An advantage of my tutorial is that it uses PyTorch as its main framework, whereas the Machine Learning Mastery tutorial uses Keras; PyTorch allows for dynamic model definition and greater customisability of the training loop. A disadvantage is that my approach requires more manual tuning, since the Keras tutorial uses an automated grid search for hyperparameter tuning. Additionally, PyTorch has a steeper learning curve than Keras, so beginners may find my tutorial more difficult to follow. A final advantage of my tutorial is the greater amount of feedback given to the user: the model's accuracy is assessed after every epoch, rather than only at the end of training as in the Keras pipeline.
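The per-epoch feedback contrasted above with end-of-run evaluation amounts to running a validation pass inside the epoch loop. A minimal sketch, using synthetic data and a placeholder linear model so it is self-contained:

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X, y = torch.randn(200, 20), torch.randint(0, 2, (200,))
train_loader = DataLoader(TensorDataset(X[:160], y[:160]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(X[160:], y[160:]), batch_size=32)

model = nn.Linear(20, 2)                    # placeholder model
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

history = []
for epoch in range(3):
    model.train()
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        criterion(model(inputs), labels).backward()
        optimizer.step()
    # The validation pass sits inside the epoch loop,
    # so progress is reported every epoch rather than once at the end
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            correct += (model(inputs).argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    history.append(100.0 * correct / total)
    print(f"Epoch {epoch + 1}/3 | Val Acc: {history[-1]:.2f}%")
```

This pattern is what produces the "Epoch X/10 | ... | Validation Accuracy: ..." lines shown in the results above, and it makes problems such as divergence or overfitting visible as soon as they start.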
Sources
PyTorch Library: PyTorch Team. (2025). torch module documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/index.html
Torchvision: PyTorch Team. (2025). torchvision module documentation. Retrieved February 25, 2025, from https://pytorch.org/vision/stable/index.html
Matplotlib: Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), 90-95. Retrieved February 25, 2025, from https://matplotlib.org/stable/index.html
NumPy: Harris, C. R., et al. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. Retrieved February 25, 2025, from https://numpy.org/doc/stable/
CUDA semantics: PyTorch Team. (2025). CUDA semantics notes. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/notes/cuda.html
Model layers: PyTorch Team. (2025). Build the Neural Network tutorial. Retrieved February 25, 2025, from https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html#model-layers
Loading the CIFAR-10 dataset: PyTorch Team. (2025). torchvision.datasets.CIFAR10 documentation. Retrieved February 25, 2025, from https://pytorch.org/vision/stable/datasets.html#torchvision.datasets.CIFAR10
CIFAR-10 classifier: PyTorch Team. (2025). Training a Classifier tutorial. Retrieved February 25, 2025, from https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
PyTorch Team. (2025). torch.cuda documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/cuda.html
PyTorch Team. (2025). torchvision.transforms documentation. Retrieved February 25, 2025, from https://pytorch.org/vision/stable/transforms.html
Normalisation values for CIFAR-10: Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. University of Toronto.
PyTorch Team. (2025). torch.optim.Optimizer.zero_grad documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html
PyTorch Team. (2025). torch.Tensor.to documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/tensors.html#torch.Tensor.to
PyTorch Team. (2025). torch.nn.Module.forward documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.forward
PyTorch Team. (2025). torch.nn.CrossEntropyLoss documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
PyTorch Team. (2025). torch.autograd documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/autograd.html
PyTorch Team. (2025). torch.optim.lr_scheduler.CosineAnnealingLR documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
PyTorch Team. (2025). torch.optim.Optimizer.step documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.step.html
PyTorch Team. (2025). torch.max documentation. Retrieved February 25, 2025, from https://pytorch.org/docs/stable/generated/torch.max.html
PyTorch Team. (2025). TensorBoard tutorial. Retrieved February 25, 2025, from https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html
Matplotlib Development Team. (2025). Pyplot tutorial. Retrieved February 25, 2025, from https://matplotlib.org/stable/tutorials/introductory/pyplot.html
Machine Learning Mastery. (2025). Grid search hyperparameters for deep learning models in Python with Keras. Retrieved February 25, 2025, from https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/